3 Jan 2024

Topics

Characteristics of time series (ts)

Classical decomposition

Characteristics of time series

  • Expectation, mean & variance
  • Covariance & correlation
  • Stationarity
  • Autocovariance & autocorrelation
  • Correlograms

What is a time series?

Classification of time series

By some index set

Discrete time; \(x_t\)

  • Equally spaced: \(t = \{1,2,3,4,5\}\)
  • Equally spaced w/ missing value: \(t = \{1,2,4,5,6\}\)
  • Unequally spaced: \(t = \{2,3,4,6,9\}\)

Classification of time series

By the underlying process

Discrete (eg, total # of fish caught per trawl)

Continuous (eg, salinity, temperature)

Classification of time series

By the number of values recorded

Univariate/scalar (eg, total # of fish caught)

Multivariate/vector (eg, # of each spp of fish caught)

Classification of time series

By the type of values recorded

Integer (eg, # of fish in 5 min trawl = 2413)

Real (eg, fish mass = 10.2 g)

Classification of time series

We will focus on integer & real values in discrete time

Univariate \((x_t)\)


Multivariate \(\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_t\)

Analysis of time series

Statistical analyses of time series

Most statistical analyses are concerned with estimating properties of a population from a sample

For example, we use fish caught in a seine to infer the mean size of fish in a lake

Statistical analyses of time series

Time series analysis, however, presents a different situation:

  • Although we could vary the length of an observed time series, it is often impossible to make multiple observations at a given point in time

For example, one can’t observe today’s closing price of Microsoft stock more than once

Thus, conventional statistical procedures, based on large sample estimates, are inappropriate

Descriptions of time series

Number of users connected to the internet

Descriptions of time series

Number of lynx trapped in Canada from 1821-1934

Classical decomposition

Model time series \(\{x_t\}\) as a combination of

  1. trend (\(m_t\))
  2. seasonal component (\(s_t\))
  3. remainder (\(e_t\))

\(x_t = m_t + s_t + e_t\)
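In R, one quick way to sketch this additive decomposition is the base function decompose(); here it is applied to the built-in AirPassengers series used in the examples below (assuming an additive model on the raw scale):

## classical additive decomposition: x_t = m_t + s_t + e_t
dd <- decompose(AirPassengers, type = "additive")
plot(dd)  ## panels show the data, trend, seasonal & remainder components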

Classical decomposition

1. The trend (\(m_t\))

We need a way to extract the so-called signal from the noise

One common method is via “linear filters”

Linear filters can be thought of as “smoothing” the data

Classical decomposition

1. The trend (\(m_t\))

Linear filters typically take the form

\[ \hat{m}_t = \sum_{i=-\infty}^{\infty} \lambda_i x_{t+i} \]

Classical decomposition

1. The trend (\(m_t\))

For example, a moving average

\[ \hat{m}_t = \sum_{i=-a}^{a} \frac{1}{2a + 1} x_{t+i} \]

If \(a = 1\), then

\[ \hat{m}_t = \frac{1}{3}(x_{t-1} + x_t + x_{t+1}) \]

As \(a\) increases, the estimated trend becomes more smooth
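As a concrete sketch of this filter in R, the weights \(1/(2a+1)\) can be passed to stats::filter(); the choice a = 4 here is arbitrary:

## moving-average filter of half-width a (window of 2a + 1 points)
a <- 4
wts <- rep(1 / (2 * a + 1), 2 * a + 1)

## built-in monthly airline passenger counts, 1949-1960
xt <- AirPassengers

## sides = 2 centers the window on time t; the first & last a values are NA
mt_hat <- stats::filter(xt, filter = wts, sides = 2)

plot.ts(xt)
lines(mt_hat, col = "blue", lwd = 2)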

Example of linear filtering

Monthly airline passengers from 1949-1960

Classical decomposition

2. Seasonal effect (\(s_t\))

Once we have an estimate of the trend \(\hat{m}_t\), we can estimate \(\hat{s}_t\) simply by subtraction:

\[ \hat{s}_t = x_t - \hat{m}_t \]

Classical decomposition

Seasonal effect (\(\hat{s}_t\)), assuming filter weights \(\lambda_i = 1/9\)

Classical decomposition

2. Seasonal effect (\(s_t\))

But, \(\hat{s}_t\) really includes the remainder \(e_t\) as well

\[ \begin{align} \hat{s}_t &= x_t - \hat{m}_t \\ (s_t + e_t) &= x_t - m_t \end{align} \]

Classical decomposition

2. Seasonal effect (\(s_t\))

So we need to estimate the mean seasonal effect as

\[ \begin{align} \hat{s}_{\text{Jan}} &= \frac{1}{N/12} (\hat{s}_1 + \hat{s}_{13} + \hat{s}_{25} + \cdots) \\ \hat{s}_{\text{Feb}} &= \frac{1}{N/12} (\hat{s}_2 + \hat{s}_{14} + \hat{s}_{26} + \cdots) \\ &\;\;\vdots \\ \hat{s}_{\text{Dec}} &= \frac{1}{N/12} (\hat{s}_{12} + \hat{s}_{24} + \hat{s}_{36} + \cdots) \end{align} \]
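Continuing the R sketch above (with xt and mt_hat from the filtering example), the monthly means can be computed with tapply(); na.rm = TRUE drops the NA values the filter leaves at the ends:

## seasonal + error component: data minus estimated trend
st_hat <- xt - mt_hat

## mean seasonal effect: average within each calendar month (1-12)
st_bar <- tapply(st_hat, cycle(st_hat), mean, na.rm = TRUE)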

Mean seasonal effect (\(s_t\))

Classical decomposition

3. Remainder (\(e_t\))

Now we can estimate \(e_t\) via subtraction:

\[ \hat{e}_t = x_t - \hat{m}_t - \hat{s}_t \]
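In the same R sketch, the remainder follows by subtraction; indexing st_bar by cycle(xt) recycles the 12 monthly means over the whole series:

## remainder: what's left after removing trend & mean seasonal effect
et_hat <- xt - mt_hat - st_bar[cycle(xt)]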

Remainder (\(e_t\))

Let’s try a different model

With some other assumptions

  1. Log-transform data

  2. Linear trend
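A minimal R sketch of these two assumptions (log scale, trend linear in time):

## 1. log-transform the data
lx <- log(AirPassengers)

## 2. fit a linear trend via least squares regression on time
tt <- as.vector(time(lx))
mt_lin <- fitted(lm(lx ~ tt))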

Log-transformed data

Monthly airline passengers from 1949-1960

The trend (\(m_t\))

Seasonal effect (\(s_t\)) with error (\(e_t\))

Mean seasonal effect (\(s_t\))

Remainder (\(e_t\))

Expectation & the mean

The expectation (\(E\)) of a variable is its mean value in the population

\(\text{E}(x) \equiv\) mean of \(x = \mu\)

We can estimate \(\mu\) from a sample as

\[ m = \frac{1}{N} \sum_{i=1}^N{x_i} \]

Variance

\(\text{E}([x - \mu]^2) \equiv\) expected deviations of \(x\) about \(\mu\)

\(\text{E}([x - \mu]^2) \equiv\) variance of \(x = \sigma^2\)

We can estimate \(\sigma^2\) from a sample as

\[ s^2 = \frac{1}{N-1}\sum_{i=1}^N{(x_i - m)^2} \]
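Both estimators are built into base R as mean() and var(); a quick check with simulated values:

set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
mean(x)  ## estimate m of the population mean mu
var(x)   ## estimate s^2 of sigma^2 (uses the 1/(N-1) denominator)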

Covariance

If we have two variables, \(x\) and \(y\), we can generalize variance

\[ \sigma^2 = \text{E}([x_i - \mu][x_i - \mu]) \]

into covariance

\[ \gamma_{x,y} = \text{E}([x_i - \mu_x][y_i - \mu_y]) \]

We can estimate \(\gamma_{x,y}\) from a sample as

\[ \text{Cov}(x,y) = \frac{1}{N-1}\sum_{i=1}^N{(x_i - m_x)(y_i - m_y)} \]

Graphical example of covariance

Correlation

Correlation is a dimensionless measure of the linear association between 2 variables, \(x\) & \(y\)

It is simply the covariance standardized by the standard deviations

\[ \rho_{x,y} = \frac{\gamma_{x,y}}{\sigma_x \sigma_y} \]

\[ -1 \leq \rho_{x,y} \leq 1 \]

We can estimate \(\rho_{x,y}\) from a sample as

\[ \text{Cor}(x,y) = \frac{\text{Cov}(x,y)}{s_x s_y} \]
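Base R's cov() and cor() implement exactly these sample formulas; the simulated x & y here are purely illustrative:

set.seed(123)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)

cov(x, y)                    ## sample covariance
cov(x, y) / (sd(x) * sd(y))  ## standardized by the std devs ...
cor(x, y)                    ## ... which equals the sample correlation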

Stationarity & the mean

Consider a single value, \(x_t\)

\(\text{E}(x_t)\) is taken across an ensemble of all possible time series

Stationarity & the mean

Our single realization is our estimate!

Stationarity & the mean

If \(\text{E}(x_t)\) is constant across time, we say the time series is stationary in the mean

Stationarity of time series

Stationarity is a convenient assumption that allows us to describe the statistical properties of a time series.

In general, a time series is said to be stationary if there is

  1. no systematic change in the mean or variance
  2. no systematic trend
  3. no periodic variations or seasonality

Identifying stationarity

Our eyes are really bad at identifying stationarity, so we will learn some tools to help us

Autocovariance function (ACVF)

For stationary ts, we define the autocovariance function (\(\gamma_k\)) as

\[ \gamma_k = \text{E}([x_t - \mu][x_{t+k} - \mu]) \]

which means that

\[ \gamma_0 = \text{E}([x_t - \mu][x_{t} - \mu]) = \sigma^2 \]

“Smooth” time series have large ACVF for large \(k\)

“Choppy” time series have ACVF near 0 for small \(k\)

We can estimate \(\gamma_k\) from a sample as

\[ c_k = \frac{1}{N}\sum_{t=1}^{N-k}{(x_t - m)(x_{t+k} - m)} \]
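A minimal R sketch of this estimator, checked against the built-in acf() with type = "covariance"; the lynx series and lag k = 1 are just for illustration:

## sample autocovariance c_k at lag k
acvf <- function(x, k) {
  N <- length(x)
  m <- mean(x)
  sum((x[1:(N - k)] - m) * (x[(1 + k):N] - m)) / N
}

acvf(lynx, 1)
acf(lynx, lag.max = 1, type = "covariance", plot = FALSE)  ## same value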

Autocorrelation function (ACF)

The autocorrelation function (ACF) is simply the ACVF normalized by the variance

\[ \rho_k = \frac{\gamma_k}{\sigma^2} = \frac{\gamma_k}{\gamma_0} \]

The ACF measures the correlation of a time series against a time-shifted version of itself

We can estimate the ACF from a sample as

\[ r_k = \frac{c_k}{c_0} \]

Properties of the ACF

The ACF has several important properties:

  • \(-1 \leq r_k \leq 1\)
  • \(r_k = r_{-k}\)
  • \(r_k\) of a periodic function is itself periodic (see the sketch after this list)
  • \(r_k\) for the sum of 2 independent variables is the sum of \(r_k\) for each of them
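The periodicity property is easy to verify in R; this sketch uses an arbitrary sinusoid with period 12:

## the ACF of a deterministic periodic signal is itself periodic
tt <- 1:120
x_per <- sin(2 * pi * tt / 12)
acf(x_per, lag.max = 36)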

The correlogram

Graphical output for the ACF

The correlogram

The ACF at lag = 0 is always 1

The correlogram

Approximate confidence intervals (under white noise, roughly \(\pm 1.96/\sqrt{N}\))

Estimating the ACF in R

acf(ts_object)  ## plot the correlogram for a ts object
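For example, applied to the built-in lynx trappings series discussed above:

## correlogram of lynx trappings in Canada, 1821-1934
acf(lynx, lag.max = 30)

## or return the r_k values without plotting
acf(lynx, lag.max = 30, plot = FALSE)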

ACF for deterministic forms

Induced autocorrelation

Recall the transitive property, whereby

If \(A = B\) and \(B = C\), then \(A = C\)

which suggests that

If \(x \propto y\) and \(y \propto z\), then \(x \propto z\)

and thus

If \(x_t \propto x_{t+1}\) and \(x_{t+1} \propto x_{t+2}\), then \(x_t \propto x_{t+2}\)

Partial autocorrelation function (PACF)

The partial autocorrelation function (\(\phi_k\)) measures the correlation between a series \(x_t\) and \(x_{t+k}\) with the linear dependence of the intervening values \(\{x_{t+1},x_{t+2},\dots,x_{t+k-1}\}\) removed

We can estimate \(\phi_k\) from a sample as

\[ \phi_k = \begin{cases} \text{Cor}(x_1,x_0) = \rho_1 & \text{if } k = 1 \\ \text{Cor}(x_k-x_k^{k-1}, x_0-x_0^{k-1}) & \text{if } k \geq 2 \end{cases} \]

\[ x_k^{k-1} = \beta_1 x_{k-1} + \beta_2 x_{k-2} + \dots + \beta_{k-1} x_1 \]

\[ x_0^{k-1} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{k-1} x_{k-1} \]
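In R, the sample PACF comes from pacf(), or equivalently acf() with type = "partial"; the lynx series is again just for illustration:

## partial correlogram of the lynx trappings series
pacf(lynx, lag.max = 30)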

Lake Washington phytoplankton

Autocorrelation

Lake Washington phytoplankton

Partial autocorrelation

ACF & PACF in model selection

We will see that the ACF & PACF are very useful for identifying the orders of ARMA models

Cross-covariance function (CCVF)

Often we want to look for relationships between 2 different time series

We can extend the notion of covariance to cross-covariance

We can estimate the CCVF \((g^{x,y}_k)\) from a sample as

\[ g^{x,y}_k = \frac{1}{N}\sum_{t=1}^{N-k}{(x_t - m_x)(y_{t+k} - m_y)} \]

Cross-correlation function (CCF)

The cross-correlation function is the CCVF normalized by the standard deviations of x & y

\[ r^{x,y}_k = \frac{g^{x,y}_k}{s_x s_y} \]

Just as with other measures of correlation

\[ -1 \leq r^{x,y}_k \leq 1 \]

Estimating the CCF in R

ccf(x, y)

Note: the lag k value returned by ccf(x, y) is the correlation between x[t+k] and y[t]

In an explanatory context, we often think of \(y = f(x)\), so it’s helpful to use ccf(y, x) and only consider positive lags
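A minimal simulated sketch of that advice, where y responds to x with a 2-step delay, so ccf(y, x) should spike at lag +2:

## hypothetical example: y_t = x_{t-2} + noise
set.seed(42)
x <- rnorm(200)
y <- c(rep(0, 2), x[1:198]) + rnorm(200, sd = 0.5)

## correlation between y[t+k] and x[t]; peak expected at k = 2
ccf(y, x)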

Example of cross-correlation